[TG-9002] Symex: revert guards after call return #5003

smowton · 2019-08-09T16:43:24Z

Executing a function may have a cumulative effect on the state guard. For example, if the callee contained ASSUME statements that rendered one or more control-flow options unviable then the guard might still embody that restriction (i.e. for if(x) ASSUME(false) the guard might still be !x). However, on function return we know that all control-flow paths have converged or been shown to be unviable, so we can restore the simpler guard as it was when we entered the callee function.

Exceptions:
(a) if the guard is false it would be correct but inefficient to restore it; keep it false until we find a convergeance with another viable path
(b) if we're doing path-sensitive symex then we do tail duplication, and there are no control-flow convergeances. Keep the guard as-was.
(c) if we're executing a multi-threaded program then symex_assume_l2 uses the guard to accumulate assumptions, which we must not discard.

In truth this optimisation is applicable whenever a block postdominates another, but function structure gives us a cheap way to establish postdominance without analysis (in the absence of setjmp/longjmp at least)

smowton · 2019-08-09T16:45:33Z

Poster-child example courtesy of @hannes-steffenhagen-diffblue:

#include <assert.h>
#include <string.h>
#include <stdlib.h>


char *nondet_str(void) {
  const char empty[] = "                               ";
  char *result = strdup(empty);
  int x;
  __CPROVER_assume(x >= 0 && x <= 9);
  result[30] = (char)('0'+x);
  return result;
}

int main(void) {
  long x = 2;
  const char *sx = "                              2";
  for(int i = 0; i < 10; ++i) {
    assert(strtol(nondet_str(), NULL, 10) == x);
  }
  return 0;
}

This previously got bogged down in an ever-growing guard, I think because the guard |= operator didn't quite manage to squash the guard back down so that it left strtol in the same state as it entered, and so the run slowed ever down and down. With this patch that example takes around 5 seconds to complete on my machine.

kroening · 2019-08-10T09:27:41Z

Guards getting fat & monstrous has bugged me for many years -- and I believe it's a major performance problem.
I agree it's probably not sensible to try to solve this exactly, but that cheap, syntactic stuff is the way to go.
Other ideas like "function body"?

kroening · 2019-08-10T09:28:29Z

Another BTW: I think that path-pased symex suffers from this just as much.

smowton · 2019-08-10T14:44:17Z

@kroening I don't follow "function body"?

smowton · 2019-08-10T14:58:26Z

SVCOMP/Java: 3 tests stop timing out (timeout = 30 seconds), overall CPU time down 5%

smowton · 2019-08-10T15:02:06Z

SVCOMP/C subsample: 3 tests stop timing out (timeout = 30 seconds), total CPU time down 2%

smowton · 2019-08-12T10:22:30Z

CI failure: this harms performance for cases with many conditional ASSUME false; instructions in one function -- previously these would be ignored for guard purposes and the unviable instructions executed regardless, which was wasteful but lead to the guard being simplified more effectively. This is especially punitive with BDD guards.

Note this already happens now when paths are truncated by falsifying the guard -- for example exceeding depth bound, or the (specifically recursion) unwind bound.

Proposed course of action: merge this PR with the guard reversion but not the path truncation cleanup, then separately pursue guard shrinkage when paths are truncated, then make path truncation uniform so that we don't execute instructions after an ASSUME false.

smowton · 2019-08-12T16:55:28Z

In view of this I've removed the cleanup of ASSUME vs. guard manipulation from this PR and only make the on-function-return guard simplification. I'm re-running my benchmarks on this modified version and expect CI to pass here.

smowton · 2019-08-13T09:28:07Z

Current status: causes about 10 SVCOMP/Java tests to stop timing out and instead return incorrect answers. Looking into it...

codecov-io · 2019-08-13T09:49:49Z

Codecov Report

Merging #5003 into develop will decrease coverage by 0.04%.
The diff coverage is 100%.

@@             Coverage Diff             @@
##           develop    #5003      +/-   ##
===========================================
- Coverage    69.47%   69.43%   -0.05%     
===========================================
  Files         1310     1310              
  Lines       108763   108741      -22     
===========================================
- Hits         75566    75502      -64     
- Misses       33197    33239      +42

Impacted Files	Coverage Δ
src/goto-symex/symex_main.cpp	`86.83% <100%> (+0.04%)`	⬆️
src/goto-symex/frame.h	`100% <100%> (ø)`	⬆️
src/goto-symex/symex_goto.cpp	`94.91% <100%> (+0.02%)`	⬆️
src/goto-symex/symex_function_call.cpp	`86.8% <100%> (+0.18%)`	⬆️
src/goto-symex/call_stack.h	`100% <100%> (ø)`	⬆️
src/goto-symex/goto_symex_state.cpp	`91.72% <100%> (ø)`	⬆️
src/solvers/prop/bdd_expr.cpp	`60.97% <0%> (-29.57%)`	⬇️
src/solvers/bdd/bdd_miniBDD.h	`84.61% <0%> (-15.39%)`	⬇️
src/solvers/bdd/miniBDD/miniBDD.inc	`93.47% <0%> (-6.53%)`	⬇️
src/solvers/strings/string_builtin_function.h	`75.26% <0%> (-2.16%)`	⬇️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bb6baff...66133a6. Read the comment docs.

smowton · 2019-08-13T13:49:48Z

Conclusion: reintroduced those parts of the cleanup that ensure an ASSUME / real constraint is generated wherever we truncate execution due to the unwind or depth limits, as otherwise using the dominant guard would actually mean broadening the guard by re-including impossible paths. This PR now (a) ensures we generate a real constraint whenever we truncate a path, and (b) uses guard-as-of-entering-function whenever we exit that function, possibly broadening it by including paths that have been ruled out and/or simplifying it better than the expression-simplifier is capable of.

Forthcoming work will separate the concepts of guards and reachability, such that we can reduce the execution guard at control-flow convergances even when one of the predecessors is unreachable, while maintaining the benefit of not generating constraints for unreachable code.

allredj

✔️
Passed Diffblue compatibility checks (cbmc commit: 9aec581).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/122999685

smowton · 2019-08-14T10:46:00Z

Re-benchmarked with neutral results: all I claim now is that specific examples where the guard gets out of hand, like Hannes' example above, become analysable with this change but were previously impractically slow.

peterschrammel · 2019-08-12T10:09:23Z

regression/cbmc/assume-false/test.desc

+main::1::i!0@1#[0-9]+ = 2
+--
+This checks that code rendered unreachable by a __CPROVER_assume(0) does not generate constraints.
+


redundant blank line

regression/cbmc/residual-guards-1/test_execution.desc

regression/cbmc/residual-guards-2/test_execution.desc

regression/cbmc/residual-guards-3/test_execution.desc

regression/cbmc/residual-guards-4/test_execution.desc

Executing a function may have a cumulative effect on the state guard. For example, if the callee contained ASSUME statements that rendered one or more control-flow options unviable then the guard might still embody that restriction (i.e. for if(x) ASSUME(false) the guard might still be `!x`). However, on function return we know that all control-flow paths have converged or been shown to be unviable, so we can restore the simpler guard as it was when we entered the callee function. Exceptions: (a) if the guard is false it would be correct but inefficient to restore it; keep it false until we find a convergeance with another viable path (b) if we're doing path-sensitive symex then we do tail duplication, and there are no control-flow convergeances. Keep the guard as-was. (c) if we're executing a multi-threaded program then symex_assume_l2 uses the guard to accumulate assumptions, which we must not discard. In truth this optimisation is applicable whenever a block postdominates another, but function structure gives us a cheap way to establish postdominance without analysis (in the absence of setjmp/longjmp at least)

Whenever we truncate a particular execution path, whether due to --unwind or --depth settings, we should generate an assumption so that it is not necessary to record the rejected path in the state guard. For example, before if we exceeded the depth limit then we would set the guard to false, and rely on carrying this missing path forwards to any future assume or assert statement. By emitting an assume we explicitly make a constraint ruling this path out, meaning that downstream guards no longer need to carry that information (for example, where before we might have had guard `x` and VCCs `!x => y` and `!x => z`, we can now emit a constraint `!x`, restore the guard to `true` and produce VCCs `y` and `z`). Note this commit does not actually change when the guard is set false and when it is not, as this will have additional performance consequences and should be benchmarked independently.

allredj

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 5dbbf29).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123209903
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

allredj

✔️
Passed Diffblue compatibility checks (cbmc commit: 66133a6).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123215484

smowton requested review from kroening, peterschrammel and tautschnig as code owners August 9, 2019 16:43

smowton force-pushed the smowton/feature/revert-guards-after-calls branch from e7525ef to a1a7b6c Compare August 9, 2019 16:56

smowton requested review from JohnDumbell, romainbrenguier and allredj August 9, 2019 16:56

smowton force-pushed the smowton/feature/revert-guards-after-calls branch from a1a7b6c to 9c9c883 Compare August 10, 2019 20:58

smowton changed the title ~~Symex: revert guards after call return~~ [TG-9002] Symex: revert guards after call return Aug 12, 2019

smowton force-pushed the smowton/feature/revert-guards-after-calls branch 2 times, most recently from 0911519 to ccf6391 Compare August 13, 2019 08:44

allredj reviewed Aug 13, 2019

View reviewed changes

peterschrammel approved these changes Aug 14, 2019

View reviewed changes

smowton added 2 commits August 14, 2019 12:03

smowton force-pushed the smowton/feature/revert-guards-after-calls branch from 91636b4 to 66133a6 Compare August 14, 2019 11:03

allredj reviewed Aug 14, 2019

View reviewed changes

kroening self-assigned this Aug 14, 2019

kroening merged commit 60b2851 into diffblue:develop Aug 15, 2019

[TG-9002] Symex: revert guards after call return #5003

[TG-9002] Symex: revert guards after call return #5003

Uh oh!

Conversation

smowton commented Aug 9, 2019

Uh oh!

smowton commented Aug 9, 2019

Uh oh!

kroening commented Aug 10, 2019

Uh oh!

kroening commented Aug 10, 2019

Uh oh!

smowton commented Aug 10, 2019

Uh oh!

smowton commented Aug 10, 2019

Uh oh!

smowton commented Aug 10, 2019

Uh oh!

smowton commented Aug 12, 2019

Uh oh!

smowton commented Aug 12, 2019

Uh oh!

smowton commented Aug 13, 2019

Uh oh!

codecov-io commented Aug 13, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

smowton commented Aug 13, 2019

Uh oh!

allredj left a comment

Choose a reason for hiding this comment

Uh oh!

smowton commented Aug 14, 2019

Uh oh!

peterschrammel Aug 12, 2019

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

allredj left a comment

Choose a reason for hiding this comment

Uh oh!

allredj left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov-io commented Aug 13, 2019 •

edited

Loading